A Summary and Comparison of Two Approaches for Determinization of Lattices
Abstract
In many applications of speech and language processing, we generate intermediate results in the form of a lattice to which we apply finite-state operations. For example, we might POS-tag the words in an ASR output lattice as an intermediate stage for language modeling. Currently, we have to convert the lattice into n-best scoring sub-lattices (one sub-lattice per unique input sequence), analyze each sub-lattice separately to get its best-scoring output sequence, and combine the resulting output sequences back into a lattice. We present two methods that eliminate this conversion by computing only the 1-best scoring output sequence for every input sequence in the lattice. The problem arises in any finite-state tagging task, such as POS tagging, word segmentation, or named entity recognition, as well as in discriminative training, when we need to extract the best time boundaries or the best acoustic, pronunciation, or language model scores in an ASR lattice. Shafran et al. [1] and Povey et al. [4] independently proposed two solutions to this problem. Selecting the n-best scoring sequences in the lattice is not a solution, because the result may contain more than one analysis for some input sequences while discarding all analyses of others. Regular transducer determinization does not solve the problem either: a transducer may include several different output sequences for a given input sequence and still be deterministic as a transducer. Instead, in recently published work, Shafran et al. and Povey et al. each define a novel semiring under which determinization preserves only the best-scoring output sequence for each input sequence. In Sections 2 and 3 we briefly introduce their methods. In Section 4 we describe our ongoing research, which follows up on these two independent projects: we are investigating the similarities and differences between the two approaches and comparing them under the same conditions.
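The contrast between n-best selection and the desired per-input 1-best result can be illustrated with a minimal sketch. It does not implement either semiring-based method from [1] or [4]; it simply enumerates an explicit, hypothetical list of lattice paths (the data, `n_best`, and `best_per_input` are all assumptions made for illustration) and treats lower cost as better.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: each path is (input words, output tags, cost).
# Lower cost is better, as in a tropical-semiring lattice.
Seq = Tuple[str, ...]
Path = Tuple[Seq, Seq, float]

PATHS: List[Path] = [
    (("a", "b"), ("X", "Y"), 1.0),
    (("a", "b"), ("X", "Z"), 1.5),
    (("a", "c"), ("X", "Y"), 4.0),
    (("a", "c"), ("W", "Y"), 3.0),
]


def n_best(paths: List[Path], n: int) -> List[Path]:
    """Keep the n lowest-cost paths overall, ignoring which input they belong to."""
    return sorted(paths, key=lambda p: p[2])[:n]


def best_per_input(paths: List[Path]) -> Dict[Seq, Tuple[Seq, float]]:
    """Keep exactly one output sequence (the lowest-cost one) per input sequence."""
    best: Dict[Seq, Tuple[Seq, float]] = {}
    for inp, out, cost in paths:
        if inp not in best or cost < best[inp][1]:
            best[inp] = (out, cost)
    return best


if __name__ == "__main__":
    # 2-best keeps two analyses of ("a", "b") and loses ("a", "c") entirely.
    print(n_best(PATHS, 2))
    # Per-input 1-best keeps one analysis for every input sequence.
    print(best_per_input(PATHS))
```

On this toy data, 2-best selection returns two analyses of the same input and none for the other, whereas the per-input selection keeps one analysis for each. Explicit enumeration is only feasible for tiny examples; the approaches summarized here obtain the same result through determinization without listing paths.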
Related resources
Alternative approaches to obtain t-norms and t-conorms on bounded lattices
Triangular norms were introduced in the study of probabilistic metric spaces as a special kind of associative function defined on the unit interval. These functions have found applications in many areas since then. In this study, we present new methods for constructing triangular norms and triangular conorms on an arbitrary bounded lattice under some constraints. Also, we give some illustrative examples for t...
The Impact of Summary Writing with Structure Guidelines on EFL College Students’ Rhetorical Organization: Integrating Genre-Based and Process Approaches
This study aimed at investigating the impact of writing on Iranian EFL college students’ rhetorical organization. Thirty Iranian female undergraduate students majoring in English at Al-zahra University participated in the current study. The writing instructions included two stages, each lasting for four weeks. The participants were assigned to a control group and an experimental group according...
Frankl's Conjecture for a subclass of semimodular lattices
In this paper, we prove Frankl's Conjecture for an upper semimodular lattice $L$ such that $|J(L) \setminus A(L)| \leq 3$, where $J(L)$ and $A(L)$ are the set of join-irreducible elements and the set of atoms respectively. It is known that the class of planar lattices is contained in the class of dismantlable lattices and the class of dismantlable lattices is contained in the class of lattices ha...
The effect of material nonlinearity on the band gap for TE and TM modes in square and triangular lattices
In this article, using the finite-difference time-domain (FDTD) method with PML boundary conditions, we study the photonic band gaps for TE and TM modes in square and triangular lattices consisting of air holes in a dielectric medium and compare the results. In addition, the effect of nonlinearity of the photonic crystal background on the photonic band gaps and comparison with the re...
Multidimensional fuzzy finite tree automata
This paper introduces the notion of multidimensional fuzzy finite tree automata (MFFTA) and investigates its closure properties from the area of automata and language theory. MFFTA are a superclass of fuzzy tree automata whose behavior is generalized to adapt to multidimensional fuzzy sets. An MFFTA recognizes a multidimensional fuzzy tree language which is a regular tree language so that for e...
Publication date: 2012